Abstract. We consider Markov decision processes (MDPs) with ω-regular specifications given as parity objectives.
We consider the problem of computing the set of almost-sure winning vertices from where the objective can be 
ensured with probability 1. The algorithms for the computation of the almost-sure winning set for parity objectives 
iteratively use the solutions for the almost-sure winning set for Büchi objectives (a special case of parity objectives).
We study for the first time the average case complexity of the classical algorithm for computing almost-sure winning
vertices for MDPs with Büchi objectives. Our contributions are as follows: First, we show that for MDPs with
constant out-degree the expected number of iterations is at most logarithmic and the average case running time is 
linear (as compared to the worst case linear number of iterations and quadratic time complexity). Second, we show 
that for general MDPs the expected number of iterations is constant and the average case running time is linear 
(again as compared to the worst case linear number of iterations and quadratic time complexity). Finally, we also
show that when all graphs are equally likely, the probability that the classical algorithm requires more than a constant
number of iterations is exponentially small.

1 Introduction 

Markov decision processes. Markov decision processes (MDPs) are standard models for probabilistic systems that exhibit
both probabilistic and nondeterministic behavior [13], and are widely used in verification of probabilistic systems [1].
MDPs have been used to model and solve control problems for stochastic systems [12]: there, nondeterminism rep- 
resents the freedom of the controller to choose a control action, while the probabilistic component of the behavior 
describes the system response to control actions. MDPs have also been adopted as models for concurrent probabilistic 
systems [7], probabilistic systems operating in open environments [17], and under-specified probabilistic systems [2]. 
A specification describes the set of desired behaviors of the system, which in the verification and control of stochastic 
systems is typically an ω-regular set of paths. The class of ω-regular languages extends classical regular languages
to infinite strings, and provides a robust specification language to express all commonly used specifications, such as
safety, liveness, fairness, etc. [19]. Parity objectives are a canonical way to define such ω-regular specifications. Thus
MDPs with parity objectives provide the theoretical framework to study problems such as the verification and control 
of stochastic systems. 

Qualitative and quantitative analysis. The analysis of MDPs with parity objectives can be classified into qualitative 
and quantitative analysis. Given an MDP with parity objective, the qualitative analysis asks for the computation of 
the set of vertices from where the parity objective can be ensured with probability 1 (almost-sure winning). The 
more general quantitative analysis asks for the computation of the maximal probability at each state with which the 
controller can satisfy the parity objective. 

Importance of qualitative analysis. The qualitative analysis of MDPs is an important problem in verification that 
is of interest irrespective of the quantitative analysis problem. There are many applications where we need to know 
whether the correct behavior arises with probability 1. For instance, when analyzing a randomized embedded scheduler,
we are interested in whether every thread progresses with probability 1 [9]. Even in settings where it suffices
to satisfy certain specifications with probability p < 1, the correct choice of p is a challenging problem, due to the 
simplifications introduced during modeling. For example, in the analysis of randomized distributed algorithms it is 
quite common to require correctness with probability 1 (see, e.g., [15, 14, 18]). Furthermore, in contrast to quantitative 
analysis, qualitative analysis is robust to numerical perturbations and modeling errors in the transition probabilities,
and consequently the algorithms for qualitative analysis are combinatorial. Finally, for MDPs with parity objectives,
the best known algorithms and all algorithms used in practice first perform the qualitative analysis, and then perform 
a quantitative analysis on the result of the qualitative analysis [7, 8, 6]. Thus qualitative analysis for MDPs with parity 
objectives is one of the most fundamental and core problems in verification of probabilistic systems. 

Previous results. The qualitative analysis for MDPs with parity objectives is achieved by iteratively applying solutions 
of the qualitative analysis of MDPs with Büchi objectives [7, 8, 6]. The qualitative analysis of an MDP with a parity
objective with d priorities can be achieved by O(d) calls to an algorithm for qualitative analysis of MDPs with Büchi
objectives, and hence we focus on the qualitative analysis of MDPs with Büchi objectives. The qualitative analysis
problem for MDPs with Büchi objectives has been widely studied. The classical algorithm for qualitative analysis of
MDPs with Büchi objectives was given in [7, 8], and its worst case running time is O(n · m),
where n is the number of vertices and m is the number of edges of the MDP. Many improved algorithms
have also been given in the literature, such as [5, 3, 4], and the current best known worst case complexity of the problem is
O(min{n², m · √m}). While the worst case complexity of the problem has been studied, to the best of our knowledge
the average case complexity of none of the algorithms has been studied in the literature.

Our contribution. In this work we study for the first time the average case complexity of the qualitative analysis
of MDPs with Büchi objectives. Specifically we study the average case complexity of the classical algorithm for the
following two reasons: (1) the classical algorithm is very simple and appealing as it iteratively uses solutions of the
standard graph reachability and alternating graph reachability algorithms; and (2) for the more involved improved
algorithms it has been established that there are variants of the improved algorithms that never require more than
an additional linear amount of time as compared to the classical algorithm. We study the average case complexity of the classical algorithm
and establish that as compared to the quadratic worst case complexity, the average case complexity is linear. Our main 
contributions are summarized below: 

1. MDPs with constant out-degree. We first consider MDPs with constant out-degree. In practice, MDPs often have 
constant out-degree: for example, see [10] for MDPs with large state space but constant number of actions, or [12, 
16] for examples from inventory management where MDPs have a constant number of actions (the number of
actions corresponds to the out-degree of MDPs). We show that for MDPs with constant out-degree, the expected
number of iterations of the classical algorithm is at most logarithmic, and the average case running time is linear 
(as compared to the worst case linear number of iterations and quadratic time complexity). 

2. MDPs in the Erdős-Rényi model. To consider the average case complexity of the general case, we consider MDPs
where the underlying graph is a random graph according to the classical Erdős-Rényi random graph model [11].
We consider random graphs $\mathcal{G}_{n,p}$ over n vertices where each edge exists with probability p (independently of
other edges). We show that if $p \ge \frac{c \cdot \log(n)}{n}$ for any constant c > 2, then the expected number of iterations of the
classical algorithm is constant, and the average case running time is linear (again as compared to the worst case
linear number of iterations and quadratic time complexity). Note that we obtain that the average case (p = 1/2)
running time for the classical algorithm is linear as a special case of our results for $p \ge \frac{c \cdot \log(n)}{n}$ for any constant
c > 2. Moreover we show that when p = 1/2 (i.e., all graphs are equally likely), the probability that the classical
algorithm will require more than a constant number of iterations is exponentially small (less than $(3/4)^n$).
In summary our results show that the classical algorithm (the most simple and appealing algorithm) has excellent
(linear-time) average case complexity as compared to the quadratic worst case complexity. As a consequence of our
result it also follows that the improved algorithms also have linear time average case complexity, and thus we complete
the average case analysis of the algorithms for the qualitative analysis of MDPs with Büchi objectives.

Technical contributions. The two key technical difficulties to establish our results are as follows: (1) Though there are 
many results for random undirected graphs, for the average case analysis of the classical algorithm we need to analyze 
random directed graphs; and (2) in contrast to other results related to random undirected graphs that prove results for
almost all vertices, the classical algorithm stops when all vertices satisfy a certain reachability property; and hence we
need to prove results for all vertices (as compared to almost all vertices). In this work we set up novel recurrence relations
to estimate the expected number of iterations, and the average case running time of the classical algorithm. Our
key technical results prove many interesting inequalities related to the recurrence relation for reachability properties
of random directed graphs to establish the desired result. We believe the new results related to reachability
properties that we establish for random directed graphs will find future applications in average case analysis of other
algorithms related to verification.

2 Definitions 

Markov decision processes (MDPs). A Markov decision process (MDP) $G = ((V, E), (V_1, V_P), \delta)$ consists of a directed graph $(V, E)$, a partition $(V_1, V_P)$ of the finite set $V$ of vertices, and a probabilistic transition function $\delta: V_P \to \mathcal{D}(V)$, where $\mathcal{D}(V)$ denotes the set of probability distributions over the vertex set $V$. The vertices in $V_1$ are the player-1 vertices, where player 1 decides the successor vertex, and the vertices in $V_P$ are the probabilistic (or random) vertices, where the successor vertex is chosen according to the probabilistic transition function $\delta$. We assume that for $u \in V_P$ and $v \in V$, we have $(u, v) \in E$ iff $\delta(u)(v) > 0$, and we often write $\delta(u, v)$ for $\delta(u)(v)$. For a vertex $v \in V$, we write $E(v)$ to denote the set $\{ u \in V \mid (v, u) \in E \}$ of possible out-neighbors. For technical convenience we assume that every vertex in the graph $(V, E)$ has at least one outgoing edge, i.e., $E(v) \neq \emptyset$ for all $v \in V$.

Plays, strategies and probability measure. An infinite path, or a play, of the game graph $G$ is an infinite sequence $\omega = (v_0, v_1, v_2, \ldots)$ of vertices such that $(v_k, v_{k+1}) \in E$ for all $k \in \mathbb{N}$. We write $\Omega$ for the set of all plays, and for a vertex $v \in V$, we write $\Omega_v \subseteq \Omega$ for the set of plays that start from the vertex $v$. A strategy for player 1 is a function $\sigma: V^* \cdot V_1 \to \mathcal{D}(V)$ that chooses the probability distribution over the successor vertices for all finite sequences $w \in V^* \cdot V_1$ of vertices ending in a player-1 vertex (the sequence represents a prefix of a play). A strategy must respect the edge relation: for all $w \in V^*$ and $u \in V_1$, if $\sigma(w \cdot u)(v) > 0$, then $v \in E(u)$. We write $\Sigma$ for the set of all strategies for player 1. Once a starting vertex $v \in V$ and a strategy $\sigma \in \Sigma$ are fixed, the outcome of the MDP is a random walk $\omega_v^{\sigma}$ for which the probabilities of events are uniquely defined, where an event $\mathcal{A} \subseteq \Omega$ is a measurable set of plays. For a vertex $v \in V$ and an event $\mathcal{A} \subseteq \Omega$, we write $\Pr_v^{\sigma}(\mathcal{A})$ for the probability that a play belongs to $\mathcal{A}$ if the game starts from the vertex $v$ and player 1 follows the strategy $\sigma$.

Objectives. We specify objectives for player 1 by providing a set of winning plays $\Phi \subseteq \Omega$. We say that a play $\omega$ satisfies the objective $\Phi$ if $\omega \in \Phi$. We consider ω-regular objectives [19], specified as parity conditions. We also consider the special case of Büchi objectives.

- Büchi objectives. Let $B$ be a set of Büchi vertices. For a play $\omega = (v_0, v_1, \ldots) \in \Omega$, we define $\mathrm{Inf}(\omega) = \{ v \in V \mid v_k = v \text{ for infinitely many } k \}$ to be the set of vertices that occur infinitely often in $\omega$. The Büchi objective requires that some vertex of $B$ be visited infinitely often, and defines the set of winning plays $\mathrm{Büchi}(B) = \{ \omega \in \Omega \mid \mathrm{Inf}(\omega) \cap B \neq \emptyset \}$.

- Parity objectives. For $c, d \in \mathbb{N}$, we write $[c..d] = \{ c, c+1, \ldots, d \}$. Let $p: V \to [0..d]$ be a function that assigns a priority $p(v)$ to every vertex $v \in V$, where $d \in \mathbb{N}$. The parity objective is defined as $\mathrm{Parity}(p) = \{ \omega \in \Omega \mid \min(p(\mathrm{Inf}(\omega))) \text{ is even} \}$. In other words, the parity objective requires that the minimum priority visited infinitely often is even. In the sequel we will use $\Phi$ to denote parity objectives.
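As a small illustration of the two objectives, consider plays that are ultimately periodic, i.e., a finite prefix followed by a cycle that is repeated forever; for such a play, $\mathrm{Inf}(\omega)$ is exactly the set of vertices on the cycle. The following Python sketch checks the Büchi and parity conditions under that assumption. The function names and the encoding of a play by its cycle are ours and serve only as an illustration.

```python
def inf_vertices(cycle):
    """Inf(omega) for the play prefix . cycle^omega (assumed ultimately periodic)."""
    return set(cycle)


def satisfies_buchi(cycle, buchi_vertices):
    """Buchi(B): some vertex of B occurs infinitely often."""
    return bool(inf_vertices(cycle) & set(buchi_vertices))


def satisfies_parity(cycle, priority):
    """Parity(p): the minimum priority occurring infinitely often is even."""
    return min(priority[v] for v in inf_vertices(cycle)) % 2 == 0


def buchi_as_priority(vertices, buchi_vertices):
    """Encode Buchi(B) as a parity objective: priority 0 on B, 1 elsewhere."""
    return {v: 0 if v in buchi_vertices else 1 for v in vertices}
```

The last helper reflects why Büchi objectives are a special case of parity objectives: with priorities 0 on $B$ and 1 elsewhere, the minimum priority seen infinitely often is even exactly when $B$ is visited infinitely often.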

Qualitative analysis: almost-sure winning. Given a player-1 objective $\Phi$, a strategy $\sigma \in \Sigma$ is almost-sure winning for player 1 from the vertex $v$ if $\Pr_v^{\sigma}(\Phi) = 1$. The almost-sure winning set $\langle\!\langle 1 \rangle\!\rangle_{almost}(\Phi)$ for player 1 is the set of vertices from which player 1 has an almost-sure winning strategy. The qualitative analysis of MDPs corresponds to the computation of the almost-sure winning set for a given objective $\Phi$.

Algorithm for qualitative analysis. The almost-sure winning set for MDPs with parity objectives can be computed using $O(d)$ calls to compute the almost-sure winning set of MDPs with Büchi objectives [6-8]. Hence we focus on the qualitative analysis of MDPs with Büchi objectives. The algorithms for qualitative analysis for MDPs do not depend on the transition function, but only on the graph $G = ((V, E), (V_1, V_P))$. We now describe the classical algorithm for the qualitative analysis of MDPs with Büchi objectives; the algorithm requires the notion of random attractors.

Random attractor. Given an MDP $G$, let $U \subseteq V$ be a subset of vertices. The random attractor $Attr_P(U)$ is defined inductively as follows: $X_0 = U$, and for $i \ge 0$, let $X_{i+1} = X_i \cup \{ v \in V_P \mid E(v) \cap X_i \neq \emptyset \} \cup \{ v \in V_1 \mid E(v) \subseteq X_i \}$. In other words, $X_{i+1}$ consists of (a) vertices in $X_i$, (b) player-1 vertices all of whose successors are in $X_i$, and (c) probabilistic vertices that have at least one edge to $X_i$. Then $Attr_P(U) = \bigcup_{i \ge 0} X_i$. Observe that the random attractor is equivalent to the alternating reachability problem (reachability in AND-OR graphs).
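The inductive definition can be evaluated by a backward search that keeps, for every player-1 vertex, a counter of successors that are not yet in the attractor. The Python sketch below is one such implementation under our own encoding: `succ` maps each vertex to its list of successors, `player1` is the set $V_1$, and every other vertex is treated as probabilistic. These names are illustrative assumptions and not part of the definition.

```python
from collections import deque


def random_attractor(succ, player1, start):
    """Return Attr_P(start) for the graph given by successor lists `succ`."""
    # Predecessor lists, built from de-duplicated successor sets.
    pred = {v: [] for v in succ}
    for v, outs in succ.items():
        for u in set(outs):
            pred[u].append(v)
    # For player-1 vertices: number of successors still outside the attractor.
    outside = {v: len(set(succ[v])) for v in succ if v in player1}
    attractor = set(start)
    queue = deque(attractor)
    while queue:
        u = queue.popleft()
        for v in pred[u]:
            if v in attractor:
                continue
            if v in player1:
                outside[v] -= 1
                if outside[v] > 0:
                    continue  # some successor of v is still outside
            # Probabilistic vertex with one edge into the attractor, or
            # player-1 vertex with all successors inside: add it.
            attractor.add(v)
            queue.append(v)
    return attractor
```

Each edge is inspected at most once, which matches the linear-time bound for alternating reachability that the classical algorithm relies on.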

Classical algorithm. The classical algorithm for MDPs with Büchi objectives is a simple iterative algorithm, and every iteration uses graph reachability and alternating graph reachability (random attractors). Let us denote the MDP in iteration $i$ by $G^i$ with vertex set $V^i$. Then in iteration $i$ the algorithm executes the following steps: (i) it computes the set $Z^i$ of vertices that can reach the current set of Büchi vertices; (ii) it lets $U^i = V^i \setminus Z^i$ be the set of remaining vertices; if $U^i$ is empty, then the algorithm stops and outputs $Z^i$ as the set of almost-sure winning vertices, and otherwise it removes $Attr_P(U^i)$ from the graph and continues to iteration $i + 1$. The classical algorithm requires at most $O(n)$ iterations, where $n = |V|$, and each iteration requires at most $O(m)$ time, where $m = |E|$. Moreover the above analysis is tight, i.e., there exists a family of MDPs where the classical algorithm requires $O(n)$ iterations, and total time $O(n \cdot m)$. Hence $O(n \cdot m)$ is the tight worst case complexity of the classical algorithm for MDPs with Büchi objectives. In this work we consider the average case analysis of the classical algorithm.
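To make the iteration structure concrete, the following Python sketch follows the two steps described above: backward reachability to the Büchi vertices, then removal of the random attractor of the remaining vertices. The encoding (`succ` as successor lists, `player1` as the set of player-1 vertices, all other vertices probabilistic) and all function names are our assumptions; it is a straightforward rendering rather than a tuned implementation.

```python
def _predecessors(succ, vertices):
    pred = {v: set() for v in vertices}
    for v in vertices:
        for u in succ[v]:
            if u in vertices:
                pred[u].add(v)
    return pred


def _can_reach(pred, targets):
    seen, stack = set(targets), list(targets)
    while stack:
        u = stack.pop()
        for v in pred[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def _attractor(succ, pred, vertices, player1, start):
    # Player-1 vertices enter only once all their remaining successors are in.
    outside = {v: len({u for u in succ[v] if u in vertices})
               for v in vertices if v in player1}
    attr, stack = set(start), list(start)
    while stack:
        u = stack.pop()
        for v in pred[u]:
            if v in attr:
                continue
            if v in player1:
                outside[v] -= 1
                if outside[v] > 0:
                    continue
            attr.add(v)
            stack.append(v)
    return attr


def classical_buchi(succ, player1, buchi):
    """Return (almost-sure winning set, number of iterations)."""
    vertices, iterations = set(succ), 0
    while True:
        iterations += 1
        pred = _predecessors(succ, vertices)
        z = _can_reach(pred, set(buchi) & vertices)   # step (i)
        u = vertices - z                              # step (ii)
        if not u:
            return z, iterations
        vertices -= _attractor(succ, pred, vertices, player1, u)
```

For example, with `succ = {'b': ['b', 'x'], 'x': ['x'], 'c': ['b', 'g'], 'g': ['g']}`, `player1 = {'c'}` and Büchi set `{'b', 'g'}`, the first iteration removes the sink `x` together with `b` (which reaches `x` with positive probability), and the second iteration stops with the winning set `{'c', 'g'}`.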

3 Average Case Analysis for MDPs with Constant Out-degree 

In this section we consider the average case analysis of the number of iterations and the running time of the
classical algorithm for computing the almost-sure winning set for MDPs with Büchi objectives on a family of graphs
with constant out-degree.

Family of graphs and results. We consider graphs with vertex set of size $n$ (i.e., $|V| = n$) and the target set of Büchi vertices is of size $t$ (i.e., $|B| = t$). Every vertex $v$ has out-degree $d_v$ such that $d_{\min} \le d_v \le d_{\max}$, for given constants $d_{\min}$ and $d_{\max}$. Moreover every set of neighbors of size $d_v$ is equally likely and independent of the neighbors of other vertices. We will show the following for this family of graphs: (a) if the target set $B$ has size more than $30 \cdot x \cdot \log(n)$ (i.e., $t > 30 \cdot x \cdot \log(n)$), then the expected number of iterations is $O(1)$ and the average running time is $O(n)$; and (b) if the target vertex set $B$ has size at most $30 \cdot x \cdot \log(n)$, then the expected number of iterations required is at most $O(\log(n))$ and the average running time is $O(n)$, where $x$ is the number of distinct degrees.

Notations. We use $n$ and $t$ for the total number of vertices and the size of the target set, respectively. We will denote by $x$ the number of distinct out-degrees $d_v$, and let $d_i$, for $1 \le i \le x$, be the distinct out-degrees. Since for all vertices $v$ we have $d_{\min} \le d_v \le d_{\max}$, it follows that we have $x \le d_{\max} - d_{\min} + 1$. Let $a_i$ be the number of vertices with degree $d_i$ and $t_i$ be the number of target (Büchi) vertices with degree $d_i$.

The event $R(k_1, k_2, \ldots, k_x)$. The reverse reachable set of the target set $B$ is the set of vertices $u$ such that there is a path in the graph from $u$ to a vertex $v \in B$. Let $R(k_1, k_2, \ldots, k_x)$ denote the probability of the event that all vertices of any given set $S$ comprising $k_i$ vertices of degree $d_i$, for $1 \le i \le x$, can reach $B$ via a path that lies entirely in $S$. Note that this probability only depends on $k_i$, $1 \le i \le x$, due to the symmetry between vertices. We will investigate the reverse reachable set of $B$, which contains $B$ itself. Recall that $t_i$ vertices in $B$ have degree $d_i$, and hence we are interested in the case when $k_i \ge t_i$ for all $1 \le i \le x$.
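The random model and the reverse reachable set can be explored experimentally. The sketch below samples a graph in which each vertex independently picks a uniformly random set of out-neighbors of its prescribed size, and then computes the reverse reachable set of a target set by backward search. We assume here that neighbor sets are drawn from all $n$ vertices (so self-loops are allowed); the function names and the list-based encoding are ours.

```python
import random


def sample_constant_degree_graph(n, degrees, rng):
    """succ[v] is a uniformly random set of degrees[v] distinct vertices."""
    return [rng.sample(range(n), degrees[v]) for v in range(n)]


def reverse_reachable_set(succ, target):
    """Vertices from which some vertex of `target` is reachable."""
    pred = [[] for _ in succ]
    for v, outs in enumerate(succ):
        for u in outs:
            pred[u].append(v)
    reached = set(target)
    stack = list(reached)
    while stack:
        u = stack.pop()
        for v in pred[u]:
            if v not in reached:
                reached.add(v)
                stack.append(v)
    return reached
```

By construction, every vertex outside the returned set has all of its out-neighbors outside the set as well; this closure property is what the probability computations for this family of graphs exploit.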

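Consider a set $S$ of vertices that is the reverse reachable set, and let $S$ be composed of $k_i$ vertices of degree $d_i$ and of size $k$, i.e., $k = |S| = \sum_{i=1}^{x} k_i$. Since $S$ is the reverse reachable set, it follows that for all vertices $v$ in $V \setminus S$, there is no edge from $v$ to a vertex in $S$ (otherwise there would be a path from $v$ to a target vertex and then $v$ would belong to $S$). Thus there are no incoming edges from $V \setminus S$ to $S$, and for each vertex $v$ of $V \setminus S$, all its neighbors must lie in $V \setminus S$ itself. This happens with probability

\[ \prod_{i=1}^{x} \left( \frac{\binom{n-k}{d_i}}{\binom{n}{d_i}} \right)^{a_i - k_i}, \]

since in $V \setminus S$ there are $a_i - k_i$ vertices with degree $d_i$ and the size of $V \setminus S$ is $n - k$. The probability that each vertex in $S$ can reach a target vertex is $R(k_1, k_2, \ldots, k_x)$. Hence the probability of $S$ being the reverse reachable set is given by

\[ \prod_{i=1}^{x} \left( \frac{\binom{n-k}{d_i}}{\binom{n}{d_i}} \right)^{a_i - k_i} \cdot R(k_1, k_2, \ldots, k_x). \]

There are $\prod_{i=1}^{x} \binom{a_i - t_i}{k_i - t_i}$ possible ways of choosing $k_i \ge t_i$ vertices (since the target set is contained) out of $a_i$. The value $k$ can range from $t$ to $n$ and exactly one of these subsets of $V$ will be the reverse reachable set. So the sum of probabilities of this happening is 1. Hence we have:

\[ 1 = \sum_{k=t}^{n} \; \sum_{\sum_i k_i = k,\; t_i \le k_i \le a_i} \left( \prod_{i=1}^{x} \binom{a_i - t_i}{k_i - t_i} \left( \frac{\binom{n-k}{d_i}}{\binom{n}{d_i}} \right)^{a_i - k_i} \right) \cdot R(k_1, k_2, \ldots, k_x). \tag{1} \]

Let

\[ a_{k_1,k_2,\ldots,k_x} = \left( \prod_{i=1}^{x} \binom{a_i - t_i}{k_i - t_i} \left( \frac{\binom{n-k}{d_i}}{\binom{n}{d_i}} \right)^{a_i - k_i} \right) \cdot R(k_1, k_2, \ldots, k_x); \qquad \alpha_k = \sum_{\sum_i k_i = k,\; t_i \le k_i \le a_i} a_{k_1,k_2,\ldots,k_x}. \]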

Our goal is to show that for $30 \cdot x \cdot \log(n) \le k \le n - 1$, the value of $\alpha_k$ is very small; i.e., we want to get an upper bound on $\alpha_k$. The two key factors in $\alpha_k$ are $\left( \binom{n-k}{d_i} / \binom{n}{d_i} \right)^{a_i - k_i}$ and $R(k_1, k_2, \ldots, k_x)$. Below we get an upper bound for both of them. Firstly note that when $k$ is small, for any set $S$ comprising $k_i$ vertices of degree $d_i$ for $1 \le i \le x$ and $|S| = k$, the event $R(k_1, k_2, \ldots, k_x)$ requires each non-target vertex of $S$ to have an edge inside $S$. Since $k$ is small and all vertices have constant out-degree spread randomly over the entire graph, this is highly improbable. We formalize this intuitive argument in the following lemma.



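Lemma 1 (Upper bound on $R(k_1, k_2, \ldots, k_x)$). We have

\[ R(k_1, k_2, \ldots, k_x) \le \prod_{i=1}^{x} \left( 1 - \left( 1 - \frac{k}{n - d_i} \right)^{d_i} \right)^{k_i - t_i} \le \prod_{i=1}^{x} \left( \frac{k \cdot d_i}{n - d_i} \right)^{k_i - t_i}. \]

Proof. Let $S$ be the given set comprising $k_i$ vertices of degree $d_i$, for $1 \le i \le x$. Then for every non-target vertex of $S$ to reach a target vertex via a path in $S$, it must have at least one edge inside $S$. This gives the following upper bound on $R(k_1, k_2, \ldots, k_x)$:

\[ R(k_1, k_2, \ldots, k_x) \le \prod_{i=1}^{x} \left( 1 - \frac{\binom{n-k}{d_i}}{\binom{n}{d_i}} \right)^{k_i - t_i}. \]

We have the following inequality for all $d_i$, $1 \le i \le x$:

\[ \frac{\binom{n-k}{d_i}}{\binom{n}{d_i}} = \prod_{j=0}^{d_i - 1} \left( 1 - \frac{k}{n - j} \right) \ge \left( 1 - \frac{k}{n - d_i} \right)^{d_i} \ge 1 - \frac{d_i \cdot k}{n - d_i}. \]

The first inequality follows by replacing $j$ with $d_i \ge j$, and the second inequality follows from the standard binomial expansion. Using the above inequality in the bound for $R(k_1, k_2, \ldots, k_x)$ we obtain

\[ R(k_1, k_2, \ldots, k_x) \le \prod_{i=1}^{x} \left( 1 - \left( 1 - \frac{k}{n - d_i} \right)^{d_i} \right)^{k_i - t_i} \le \prod_{i=1}^{x} \left( \frac{d_i \cdot k}{n - d_i} \right)^{k_i - t_i}. \]

The result follows. □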
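The chain of inequalities used in the proof of Lemma 1 is easy to check numerically for concrete parameters. The following Python sketch evaluates the three quantities for given $n$, $k$ and $d$; the function name and the choice of sample values are ours, and the check is only a sanity test, not a substitute for the proof.

```python
from math import comb


def lemma1_chain(n, k, d):
    """Return (ratio, middle, lower) with ratio >= middle >= lower expected.

    ratio  = C(n-k, d) / C(n, d): probability that d random neighbors avoid a
             fixed set of k vertices
    middle = (1 - k/(n-d))^d
    lower  = 1 - d*k/(n-d)
    """
    ratio = comb(n - k, d) / comb(n, d)
    middle = (1 - k / (n - d)) ** d
    lower = 1 - d * k / (n - d)
    return ratio, middle, lower
```

For instance, `lemma1_chain(100, 10, 3)` gives values of roughly 0.727, 0.722 and 0.691, in the order predicted by the inequality.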
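Note that we have a loose and a strict bound on $R(k_1, k_2, \ldots, k_x)$. We use the loose bound whenever it is sufficient and switch to the strict bound whenever required. Now for $\left( \binom{n-k}{d_i} / \binom{n}{d_i} \right)^{a_i - k_i}$, we have the following trivial upper bound.

Lemma 2. We have $\left( \frac{\binom{n-k}{d_i}}{\binom{n}{d_i}} \right)^{a_i - k_i} \le \left( 1 - \frac{k}{n} \right)^{d_i (a_i - k_i)}$.

Proof. We have

\[ \left( \frac{\binom{n-k}{d_i}}{\binom{n}{d_i}} \right)^{a_i - k_i} = \left( \prod_{j=0}^{d_i - 1} \left( 1 - \frac{k}{n - j} \right) \right)^{a_i - k_i} \le \left( 1 - \frac{k}{n} \right)^{d_i (a_i - k_i)}. \]

The inequality follows since $j \ge 0$ and we replace $j$ by $0$ in the denominator. The result follows. □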

Next we simplify the expression of $\alpha_k$ by taking care of the summation.

Lemma 3. The probability that the reverse reachable set is of size exactly $k$ is $\alpha_k$, and

\[ \alpha_k \le n^x \cdot \max_{\sum_i k_i = k,\; t_i \le k_i \le a_i} a_{k_1,k_2,\ldots,k_x}. \]

Proof. The probability that the reverse reachable set is of size exactly $k$ is given by $\alpha_k$ (refer to Equation 1). Since

\[ \alpha_k = \sum_{\sum_i k_i = k,\; t_i \le k_i \le a_i} a_{k_1,k_2,\ldots,k_x} \]

and there are $x$ distinct degrees and $n$ vertices, the number of different terms in the summation is at most $n^x$. Hence

\[ \alpha_k \le n^x \cdot \max_{\sum_i k_i = k,\; t_i \le k_i \le a_i} a_{k_1,k_2,\ldots,k_x}. \]

The desired result follows. □

Now we proceed to achieve an upper bound on $a_{k_1,k_2,\ldots,k_x}$. As argued before, if $k$ is small, then $R(k_1, k_2, \ldots, k_x)$ is very small. On the other hand, consider the case when $k$ is very large. In this case there are very few vertices that cannot reach the target set. Hence they must have all their edges within them, which is again highly improbable. Note that different factors bind $\alpha_k$ depending on whether $k$ is small or large. This suggests we should consider these cases separately. Our proof will consist of the following case analysis of the size $k$ of the reverse reachable set: (1) when $30 \cdot x \cdot \log(n) \le k \le c_1 \cdot n$ is small (for some constant $c_1 > 0$); (2) when $c_1 \cdot n \le k \le c_2 \cdot n$ is large (for constants $c_2 \ge c_1 > 0$); and (3) when $c_2 \cdot n \le k \le n - d_{\min} - 1$ is very large. The analysis of the constants will follow from the proofs. Note that since the target set $B$ (with $|B| = t$) is a subset of its reverse reachable set, $k < t$ is infeasible. Hence in all three cases, we will only consider $k \ge t$. We first consider the case when $k$ is small.



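3.1 Small $k$: $30 \cdot x \cdot \log(n) \le k \le c_1 n$

In this section we will consider the case when $30 \cdot x \cdot \log(n) \le k \le c_1 \cdot n$ for some constant $c_1 > 0$. Note that this case only occurs when $t \le c_1 \cdot n$. We will assume this throughout this section. We will prove that there exists a constant $c_1 > 0$ such that for all $30 \cdot x \cdot \log(n) \le k \le c_1 \cdot n$ the probability ($\alpha_k$) that the size of the reverse reachable set is $k$ is bounded by $\frac{1}{n^2}$. Note that we already have a bound on $\alpha_k$ in terms of $a_{k_1,k_2,\ldots,k_x}$. Next we convert $a_{k_1,k_2,\ldots,k_x}$ into a form that is easy to analyze. Let

\[ b_{k_1,k_2,\ldots,k_x} = \prod_{i=1}^{x} \left( \frac{e \cdot (a_i - t_i)}{k_i - t_i} \right)^{k_i - t_i} \cdot e^{-\frac{k}{n} \cdot d_i (a_i - k_i)} \cdot \left( \frac{k \cdot d_i}{n - d_{\max}} \right)^{k_i - t_i}. \]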

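Lemma 4. We have $a_{k_1,k_2,\ldots,k_x} \le b_{k_1,k_2,\ldots,k_x}$.

Proof. We have

\[ a_{k_1,k_2,\ldots,k_x} \le \prod_{i=1}^{x} \binom{a_i - t_i}{k_i - t_i} \left( 1 - \frac{k}{n} \right)^{d_i (a_i - k_i)} \left( \frac{k \cdot d_i}{n - d_i} \right)^{k_i - t_i} \le \prod_{i=1}^{x} \left( \frac{e \cdot (a_i - t_i)}{k_i - t_i} \right)^{k_i - t_i} e^{-\frac{k}{n} \cdot d_i (a_i - k_i)} \left( \frac{k \cdot d_i}{n - d_{\max}} \right)^{k_i - t_i}. \]

The first inequality follows from Lemma 1 and Lemma 2. The second inequality follows from the first inequality of Proposition 1 (in technical appendix), the fact that $1 - x \le e^{-x}$, and $d_i \le d_{\max}$. □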



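Maximum of $b_{k_1,k_2,\ldots,k_x}$. Next we show that $b_{k_1,k_2,\ldots,k_x}$ drops exponentially as a function of $k$. Note that this is the reason for the logarithmic lower bound on $k$ in this section. To achieve this we consider the maximum possible value achievable by $b_{k_1,k_2,\ldots,k_x}$. Let $\partial_{k_i} b_{k_1,k_2,\ldots,k_x}$ denote the change in $b_{k_1,k_2,\ldots,k_x}$ due to change in $k_i$. Since $\sum_{i=1}^{x} k_i = k$ is fixed, it follows that $b_{k_1,k_2,\ldots,k_x}$ is maximum when for all $i$ and $j$ we have $\partial_{k_i} b_{k_1,k_2,\ldots,k_x} = \partial_{k_j} b_{k_1,k_2,\ldots,k_x}$. We have

\[ \partial_{k_i} b_{k_1,k_2,\ldots,k_x} = b_{k_1,k_2,\ldots,k_x} \cdot \left( \frac{d_i \cdot k}{n} + \log\left( \frac{d_i \cdot k}{n - d_{\max}} \right) + \log\left( \frac{a_i - t_i}{k_i - t_i} \right) \right). \]

Thus, for maximizing $b_{k_1,k_2,\ldots,k_x}$, for all $i$ and $j$ we must have

\[ \frac{d_i \cdot k}{n} + \log\left( \frac{d_i \cdot k}{n - d_{\max}} \right) + \log\left( \frac{a_i - t_i}{k_i - t_i} \right) = \frac{d_j \cdot k}{n} + \log\left( \frac{d_j \cdot k}{n - d_{\max}} \right) + \log\left( \frac{a_j - t_j}{k_j - t_j} \right), \]

that is,

\[ \frac{k_i - t_i}{(a_i - t_i) \cdot d_i \cdot e^{d_i \cdot k / n}} = \frac{k_j - t_j}{(a_j - t_j) \cdot d_j \cdot e^{d_j \cdot k / n}}. \]

This implies that for all $i$ we have

\[ k_i - t_i = \frac{(a_i - t_i) \cdot d_i \cdot e^{d_i \cdot k / n}}{\sum_{j=1}^{x} (a_j - t_j) \cdot d_j \cdot e^{d_j \cdot k / n}} \cdot (k - t). \]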

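Lemma 5. Let $L = \sum_{i=1}^{x} (a_i - t_i) \cdot d_i \cdot e^{d_i \cdot k / n}$. We have

\[ b_{k_1,k_2,\ldots,k_x} \le \left( \frac{L}{n - d_{\max}} \right)^{-t} \cdot \left( \frac{e \cdot L}{n - d_{\max}} \cdot e^{-\frac{1}{n} \sum_{i=1}^{x} d_i \cdot (a_i - t_i)} \right)^{k}. \]

Proof. The argument above shows that the maximum of $b_{k_1,k_2,\ldots,k_x}$ is achieved when for all $1 \le i \le x$ we have $k_i - t_i = \frac{(a_i - t_i) \cdot d_i \cdot e^{d_i \cdot k / n}}{L} \cdot (k - t)$. Now, plugging the values in $b_{k_1,k_2,\ldots,k_x}$, we get

\[ b_{k_1,k_2,\ldots,k_x} \le \prod_{i=1}^{x} \left( \frac{e \cdot L}{d_i \cdot e^{d_i \cdot k/n} \cdot (k - t)} \right)^{k_i - t_i} \cdot e^{-\frac{k}{n} \cdot d_i (a_i - k_i)} \cdot \left( \frac{k \cdot d_i}{n - d_{\max}} \right)^{k_i - t_i} \]

\[ = \prod_{i=1}^{x} \left( \frac{L}{n - d_{\max}} \cdot \frac{k}{k - t} \right)^{k_i - t_i} \cdot e^{k_i - t_i} \cdot e^{-\frac{k}{n} \cdot d_i (a_i - t_i)} \]

(Rearranging denominators of first and third term, gathering powers of $e$ together)

\[ = \left( \frac{L}{n - d_{\max}} \cdot \frac{k}{k - t} \right)^{\sum_{i=1}^{x} (k_i - t_i)} \cdot e^{\sum_{i=1}^{x} (k_i - t_i)} \cdot e^{-\frac{k}{n} \sum_{i=1}^{x} d_i (a_i - t_i)} \]

(Product is transformed to sum in exponent)

\[ = \left( \frac{L}{n - d_{\max}} \right)^{k - t} \cdot e^{k - t} \cdot e^{-\frac{k}{n} \sum_{i=1}^{x} d_i (a_i - t_i)} \cdot \left( 1 + \frac{t}{k - t} \right)^{k - t} \]

(As $\sum_{i=1}^{x} (k_i - t_i) = k - t$)

\[ \le \left( \frac{L}{n - d_{\max}} \right)^{k - t} \cdot e^{k - t} \cdot e^{-\frac{k}{n} \sum_{i=1}^{x} d_i (a_i - t_i)} \cdot e^{t} \]

(Since $1 + x \le e^x$ we have $\left( 1 + \frac{t}{k - t} \right)^{k - t} \le e^{t}$)

\[ = \left( \frac{L}{n - d_{\max}} \right)^{-t} \cdot \left( \frac{e \cdot L}{n - d_{\max}} \cdot e^{-\frac{1}{n} \sum_{i=1}^{x} d_i (a_i - t_i)} \right)^{k} \]

(Arranging in powers by $t$ and $k$).

The desired result follows. □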



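Lemma 6. Let $n$ be sufficiently large and let $c_1 \le \frac{0.04}{d_{\max}}$. Then for all $k \le c_1 \cdot n$ we have

\[ \frac{e \cdot L}{n - d_{\max}} \cdot e^{-\frac{1}{n} \sum_{i=1}^{x} d_i (a_i - t_i)} \le \frac{9}{10}. \]

Proof. We have the following inequality:

\[ \frac{e \cdot L}{n - d_{\max}} \cdot e^{-\frac{1}{n} \sum_{i=1}^{x} d_i (a_i - t_i)} \le \frac{e}{n - d_{\max}} \cdot \left( \sum_{i=1}^{x} d_i (a_i - t_i) \right) \cdot e^{d_{\max} \cdot c_1} \cdot e^{-\frac{1}{n} \sum_{i=1}^{x} d_i (a_i - t_i)} \]

($d_i \le d_{\max}$ and $k \le c_1 \cdot n$)

\[ = e^{d_{\max} \cdot c_1} \cdot \frac{n}{n - d_{\max}} \cdot \frac{e \cdot \frac{1}{n} \sum_{i=1}^{x} d_i (a_i - t_i)}{e^{\frac{1}{n} \sum_{i=1}^{x} d_i (a_i - t_i)}} \]

(multiplying numerator and denominator with $n$).

Here,

\[ d = \frac{1}{n} \cdot \sum_{i=1}^{x} d_i \cdot (a_i - t_i) \ge d_{\min} \cdot \frac{n - t}{n} \ge d_{\min} \cdot (1 - c_1) \ge 1. \]

The last inequality follows because $c_1 \le 0.5$ and $d_{\min} \ge 2$. Since $f(d) = d / e^{d-1}$ is a decreasing function for $d \ge 1$, we have $f(d) \le f(d_{\min} \cdot (1 - c_1))$. Thus,

\[ e^{d_{\max} \cdot c_1} \cdot \frac{n}{n - d_{\max}} \cdot \frac{d}{e^{d-1}} \le e^{d_{\max} \cdot c_1} \cdot \frac{n}{n - d_{\max}} \cdot \frac{d_{\min} \cdot (1 - c_1)}{e^{d_{\min} \cdot (1 - c_1) - 1}} \]

\[ = e^{(d_{\min} + d_{\max}) \cdot c_1} \cdot \frac{n}{n - d_{\max}} \cdot (1 - c_1) \cdot \frac{d_{\min}}{e^{d_{\min} - 1}} \]

\[ \le e^{2 \cdot d_{\max} \cdot c_1} \cdot \frac{n}{n - d_{\max}} \cdot \frac{2}{e} \qquad ((1 - c_1) \le 1 \text{ and } f(d_{\min}) \le f(2) = 2/e) \]

\[ \le 2 \cdot e^{-0.92} \cdot \frac{n}{n - d_{\max}} \qquad \left( c_1 \le \frac{0.04}{d_{\max}} \right) \]

\[ \le 0.9 \qquad (\text{for sufficiently large } n). \]

The desired result follows. □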

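Lemma 7. For sufficiently large $n$ and $c_1 \le 0.2$ we have $\frac{L}{n - d_{\max}} \ge 1$.

Proof. We have the following inequality:

\[ L = \sum_{i=1}^{x} (a_i - t_i) \cdot d_i \cdot e^{d_i \cdot k / n} \ge 2 \cdot \sum_{i=1}^{x} (a_i - t_i) \qquad (\text{since } e^{d_i \cdot k / n} \ge 1 \text{ and } d_i \ge d_{\min} \ge 2) \]

\[ = 2 \cdot (n - t) \ge 2 \cdot n \cdot (1 - c_1) \qquad (\text{since } t \le c_1 \cdot n) \]

\[ \ge 1.6 \cdot n \qquad (c_1 \le 0.2). \]

Also, $n - d_{\max} \le 1.6 \cdot n$ for large $n$. The desired result follows. □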

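Now we prove the bound on $b_{k_1,k_2,\ldots,k_x}$.

Lemma 8 (Upper bound on $b_{k_1,k_2,\ldots,k_x}$). There exists $c_1 > 0$ such that for sufficiently large $n$ and $t \le k \le c_1 \cdot n$, we have $b_{k_1,k_2,\ldots,k_x} \le \left( \frac{9}{10} \right)^{k}$.

Proof. Let $0 < c_1 \le \frac{0.04}{d_{\max}} \le 0.2$ as in Lemma 6. By Lemma 5 we have

\[ b_{k_1,k_2,\ldots,k_x} \le \left( \frac{L}{n - d_{\max}} \right)^{-t} \cdot \left( \frac{e \cdot L}{n - d_{\max}} \cdot e^{-\frac{1}{n} \sum_{i=1}^{x} d_i (a_i - t_i)} \right)^{k}. \]

By Lemma 7 we have $\frac{L}{n - d_{\max}} \ge 1$, hence $\left( \frac{L}{n - d_{\max}} \right)^{-t} \le 1$. By Lemma 6 we have

\[ \frac{e \cdot L}{n - d_{\max}} \cdot e^{-\frac{1}{n} \sum_{i=1}^{x} d_i (a_i - t_i)} \le \frac{9}{10}. \]

The desired result follows trivially. □

Taking appropriate bounds on the value of $k$, we get an upper bound on $a_{k_1,k_2,\ldots,k_x}$.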

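Lemma 9 (Upper bound on $a_{k_1,k_2,\ldots,k_x}$). There exists $c_1 > 0$ such that for $t \le c_1 \cdot n$ and for all $30 \cdot x \cdot \log(n) \le k \le c_1 \cdot n$, we have $a_{k_1,k_2,\ldots,k_x} \le \frac{1}{n^{3x}}$, where $x = (d_{\max} - d_{\min} + 1)$.

Proof. By Lemma 4 we have $a_{k_1,k_2,\ldots,k_x} \le b_{k_1,k_2,\ldots,k_x}$ and by Lemma 8 we have $b_{k_1,k_2,\ldots,k_x} \le \left( \frac{9}{10} \right)^{k}$. Thus for $k \ge 30 \cdot x \cdot \log(n)$,

\[ a_{k_1,k_2,\ldots,k_x} \le \left( \frac{9}{10} \right)^{k} \le \left( \frac{9}{10} \right)^{30 \cdot x \cdot \log(n)} = n^{30 \cdot x \cdot \log(9/10)} \le n^{-3x} = \frac{1}{n^{3x}}. \]

The desired result follows. □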
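The constant 30 in the threshold $30 \cdot x \cdot \log(n)$ is what makes the last step of Lemma 9 work: it is needed that $30 \cdot \log(9/10) \le -3$. The short Python sketch below evaluates both sides of the final inequality for concrete values; we assume here that $\log$ denotes the natural logarithm, and the function name is ours.

```python
from math import log


def lemma9_sides(n, x):
    """Return ((9/10)^(30*x*log n), n^(-3x)), assuming natural logarithms."""
    k = 30 * x * log(n)
    return (9 / 10) ** k, float(n) ** (-3 * x)
```

Since $30 \cdot \ln(9/10) \approx -3.16$, the first value is $n^{-3.16 x}$ (approximately), which is below $n^{-3x}$ for every $n \ge 1$.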

Lemma 10 (Main lemma for small $k$). There exists $c_1 > 0$ such that for $t \le c_1 \cdot n$ and for all $30 \cdot x \cdot \log(n) \le k \le c_1 \cdot n$, the probability that the size of the reverse reachable set $S$ is $k$ is at most $\frac{1}{n^2}$, where $x = (d_{\max} - d_{\min} + 1)$.

Proof. The probability that the reverse reachable set is of size $k$ is given by $\alpha_k$. By Lemma 3 and Lemma 9 it follows that the probability is at most $n^{x} \cdot n^{-3x} = n^{-2x} \le \frac{1}{n^2}$. The desired result follows. □



3.2 Large $k$: $c_1 \cdot n \le k \le c_2 \cdot n$

In this section we will show that for all constants $c_1$ and $c_2$, with $0 < c_1 \le c_2$, when $t \le c_2 \cdot n$ the probability $\alpha_k$ is at most $\frac{1}{n^2}$ for all $c_1 \cdot n \le k \le c_2 \cdot n$. We start with a few notations that we will use in the proofs. Let $a_i = p_i \cdot n$, $t_i = y_i \cdot n$, $k_i = s_i \cdot n$ for $1 \le i \le x$ and $k = s \cdot n$ for $c_1 \le s \le c_2$. We first present a bound on $a_{k_1,k_2,\ldots,k_x}$.

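Lemma 11. For all $k_1, k_2, \ldots, k_x$ such that $\sum_{i=1}^{x} k_i = k$ with $t_i \le k_i \le a_i$, and $c_1 \cdot n \le k \le c_2 \cdot n$, we have

\[ a_{k_1,k_2,\ldots,k_x} \le (n + 1)^{x} \cdot \mathrm{Term}_1 \cdot \mathrm{Term}_2, \]

where

\[ \mathrm{Term}_1 = \left( \prod_{i=1}^{x} \left( \frac{p_i - y_i}{s_i - y_i} \right)^{s_i - y_i} \left( \frac{p_i - y_i}{p_i - s_i} \right)^{p_i - s_i} (1 - s)^{d_i (p_i - s_i)} \left( 1 - (1 - s)^{d_i} \right)^{s_i - y_i} \right)^{n} \]

and

\[ \mathrm{Term}_2 = \prod_{i=1}^{x} \left( \frac{1 - \left( 1 - \frac{s}{1 - d_i/n} \right)^{d_i}}{1 - (1 - s)^{d_i}} \right)^{n (s_i - y_i)}. \]

Proof. We have

\[ a_{k_1,k_2,\ldots,k_x} = \left( \prod_{i=1}^{x} \binom{a_i - t_i}{k_i - t_i} \left( \frac{\binom{n-k}{d_i}}{\binom{n}{d_i}} \right)^{a_i - k_i} \right) \cdot R(k_1, k_2, \ldots, k_x) \]

\[ \le \prod_{i=1}^{x} (a_i - t_i + 1) \cdot \left( \frac{a_i - t_i}{k_i - t_i} \right)^{k_i - t_i} \left( \frac{a_i - t_i}{a_i - k_i} \right)^{a_i - k_i} \left( \frac{\binom{n-k}{d_i}}{\binom{n}{d_i}} \right)^{a_i - k_i} \cdot R(k_1, k_2, \ldots, k_x) \]

(Applying second inequality of Proposition 1 with $\ell = a_i - t_i$ and $j = k_i - t_i$)

\[ \le (n + 1)^{x} \cdot \prod_{i=1}^{x} \left( \frac{a_i - t_i}{k_i - t_i} \right)^{k_i - t_i} \left( \frac{a_i - t_i}{a_i - k_i} \right)^{a_i - k_i} \left( \frac{\binom{n-k}{d_i}}{\binom{n}{d_i}} \right)^{a_i - k_i} \cdot R(k_1, k_2, \ldots, k_x). \]

Proposition 1 is presented in the technical appendix. The last inequality above is obtained as follows: $(a_i - t_i + 1) \le n + 1$ as $a_i \le n$. Our goal is now to show that

\[ Y = \prod_{i=1}^{x} \left( \frac{a_i - t_i}{k_i - t_i} \right)^{k_i - t_i} \left( \frac{a_i - t_i}{a_i - k_i} \right)^{a_i - k_i} \left( \frac{\binom{n-k}{d_i}}{\binom{n}{d_i}} \right)^{a_i - k_i} \cdot R(k_1, k_2, \ldots, k_x) \le \mathrm{Term}_1 \cdot \mathrm{Term}_2. \]

We have (i) $a_i - t_i = n (p_i - y_i)$; (ii) $k_i - t_i = n (s_i - y_i)$; and (iii) $a_i - k_i = n (p_i - s_i)$. Hence we have

\[ \prod_{i=1}^{x} \left( \frac{a_i - t_i}{k_i - t_i} \right)^{k_i - t_i} \left( \frac{a_i - t_i}{a_i - k_i} \right)^{a_i - k_i} = \prod_{i=1}^{x} \left( \frac{p_i - y_i}{s_i - y_i} \right)^{n (s_i - y_i)} \left( \frac{p_i - y_i}{p_i - s_i} \right)^{n (p_i - s_i)}. \]

By Lemma 2 we have

\[ \prod_{i=1}^{x} \left( \frac{\binom{n-k}{d_i}}{\binom{n}{d_i}} \right)^{a_i - k_i} \le \prod_{i=1}^{x} \left( 1 - \frac{k}{n} \right)^{d_i \cdot n \cdot (p_i - s_i)}. \]

By Lemma 1 we have

\[ R(k_1, k_2, \ldots, k_x) \le \prod_{i=1}^{x} \left( 1 - \left( 1 - \frac{k}{n - d_i} \right)^{d_i} \right)^{n (s_i - y_i)}. \]

Hence we have

\[ Y \le \prod_{i=1}^{x} \left( \frac{p_i - y_i}{s_i - y_i} \right)^{n (s_i - y_i)} \left( \frac{p_i - y_i}{p_i - s_i} \right)^{n (p_i - s_i)} \left( 1 - \frac{k}{n} \right)^{d_i n (p_i - s_i)} \left( 1 - \left( 1 - \frac{k}{n - d_i} \right)^{d_i} \right)^{n (s_i - y_i)} \]

\[ = \prod_{i=1}^{x} \left( \frac{p_i - y_i}{s_i - y_i} \right)^{n (s_i - y_i)} \left( \frac{p_i - y_i}{p_i - s_i} \right)^{n (p_i - s_i)} (1 - s)^{d_i n (p_i - s_i)} \left( 1 - \left( 1 - \frac{s}{1 - d_i/n} \right)^{d_i} \right)^{n (s_i - y_i)} \]

\[ = X_1 \cdot X_2, \]

where

\[ X_1 = \left( \prod_{i=1}^{x} \left( \frac{p_i - y_i}{s_i - y_i} \right)^{s_i - y_i} \left( \frac{p_i - y_i}{p_i - s_i} \right)^{p_i - s_i} (1 - s)^{d_i (p_i - s_i)} \right)^{n}, \qquad X_2 = \prod_{i=1}^{x} \left( 1 - \left( 1 - \frac{s}{1 - d_i/n} \right)^{d_i} \right)^{n (s_i - y_i)}, \]

and

\[ X_1 \cdot X_2 = \left( \prod_{i=1}^{x} \left( \frac{p_i - y_i}{s_i - y_i} \right)^{s_i - y_i} \left( \frac{p_i - y_i}{p_i - s_i} \right)^{p_i - s_i} (1 - s)^{d_i (p_i - s_i)} \left( 1 - (1 - s)^{d_i} \right)^{s_i - y_i} \right)^{n} \cdot \prod_{i=1}^{x} \left( \frac{1 - \left( 1 - \frac{s}{1 - d_i/n} \right)^{d_i}}{1 - (1 - s)^{d_i}} \right)^{n (s_i - y_i)}. \]

The last equality is obtained by multiplying $\left( 1 - (1 - s)^{d_i} \right)^{n (s_i - y_i)}$ to $X_1$ and dividing it from $X_2$. Thus we obtain $Y \le \mathrm{Term}_1 \cdot \mathrm{Term}_2$, and the result follows. □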

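Lemma 12. $\mathrm{Term}_2$ of Lemma 11, i.e.,

\[ \prod_{i=1}^{x} \left( \frac{1 - \left( 1 - \frac{s}{1 - d_i/n} \right)^{d_i}}{1 - (1 - s)^{d_i}} \right)^{n (s_i - y_i)}, \]

is upper bounded by a constant.

Proof. We have

\[ \left( \frac{1 - \left( 1 - \frac{s}{1 - d_i/n} \right)^{d_i}}{1 - (1 - s)^{d_i}} \right)^{n (s_i - y_i)} \le \left( \frac{1 - \left( 1 - s - \frac{2 s d_i}{n} \right)^{d_i}}{1 - (1 - s)^{d_i}} \right)^{n (s_i - y_i)} \]

(for sufficiently large $n$)

\[ \le \left( \frac{1 - (1 - s)^{d_i} + \binom{d_i}{1} \cdot \frac{2 s d_i}{n} \cdot (1 - s)^{d_i - 1}}{1 - (1 - s)^{d_i}} \right)^{n (s_i - y_i)} \]

(taking first two terms of binomial expansion)

\[ \le \left( 1 + \frac{1}{1 - (1 - s)^{d_i}} \cdot \frac{2 s d_i^2}{n} \right)^{n (s_i - y_i)} \le e^{\frac{1}{1 - (1 - s)^{d_i}} \cdot 2 s d_i^2 \cdot (s_i - y_i)} \]

($(1 + x) \le e^{x}$).

Since $c_1 \le s \le c_2$, $s$ is constant, and similarly $d_{\min} \le d_i \le d_{\max}$ and hence $d_i$ is constant. Hence it follows that the above expression is constant, and hence the product of those terms for $1 \le i \le x$ is also bounded by a constant (since $x$ is constant). The result follows. □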

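Lemma 13. There exists $0 < \eta < 1$ such that Term1 of Lemma 11 is smaller than $\eta^n$ (exponentially small), i.e.,

\[
\prod_{i=1}^{x} \left(\frac{p_i - y_i}{s_i - y_i}\right)^{n(s_i - y_i)} \left(\frac{p_i - y_i}{p_i - s_i}\right)^{n(p_i - s_i)} (1-s)^{d_i \cdot n(p_i - s_i)} \left(1 - (1-s)^{d_i}\right)^{n(s_i - y_i)} \le \eta^n.
\]

Proof. Let

\[
f(d_i) = \left(\frac{p_i - y_i}{s_i - y_i}\right)^{s_i - y_i} \left(\frac{p_i - y_i}{p_i - s_i}\right)^{p_i - s_i} (1-s)^{d_i (p_i - s_i)} \left(1 - (1-s)^{d_i}\right)^{s_i - y_i}.
\]

Note that $f(d_i)$ is maximum when

\[
\frac{\partial f(d_i)}{\partial d_i} = 0 \iff d_i = d^* = \frac{\log\left(\frac{p_i - s_i}{p_i - y_i}\right)}{\log(1 - s)}.
\]

Moreover, $f(d^*) = 1$, and hence $f(d_i) \le 1$. If $d_i = d^*$ for all $i$, we have

\[
d_i \ge 2 \;\Rightarrow\; (1-s)^2 \ge (1-s)^{d_i} = \frac{p_i - s_i}{p_i - y_i}
\;\Rightarrow\; (1-s)^2 \ge \frac{\sum_i (p_i - s_i)}{\sum_i (p_i - y_i)} = \frac{1 - s}{1 - y}
\;\Rightarrow\; (1-s)(1-y) \ge 1,
\]

where the last inequality is impossible to satisfy since $0 < s < 1$. Hence there exists $i^*$ such that $f(d_{i^*}) < 1$. Since $d_{i^*} \in [d_{\min}, d_{\max}]$ has a compact domain and $f$ is a continuous function, there exists a constant $\eta < 1$ such that $f(d_{i^*}) \le \eta$. Since $f(d_i) \le 1$ for all $i$, we have $\prod_{i=1}^{x} f(d_i) \le \eta$. As Term1 equals $\left(\prod_{i=1}^{x} f(d_i)\right)^n$, the result thus follows. ∎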
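
The stationary point used in the proof can be checked numerically. The following Python sketch is only an illustration and not part of the original analysis: the names `f` and `d_star` and the sample values for $p_i$, $s_i$, $y_i$, $s$ are assumptions, chosen so that $y_i < s_i < p_i$ and $0 < s < 1$.

```python
import math

def f(d, p_i, s_i, y_i, s):
    """One factor of Term1 with the common exponent n removed."""
    a = s_i - y_i  # exponent of the factors tied to vertices inside the set
    b = p_i - s_i  # exponent of the factors tied to vertices outside the set
    return (((p_i - y_i) / a) ** a
            * ((p_i - y_i) / b) ** b
            * (1 - s) ** (d * b)
            * (1 - (1 - s) ** d) ** a)

def d_star(p_i, s_i, y_i, s):
    """Stationary point of f in d, obtained from (1 - s)^d = (p_i - s_i) / (p_i - y_i)."""
    return math.log((p_i - s_i) / (p_i - y_i)) / math.log(1 - s)

# Hypothetical sample values (assumptions for illustration only).
params = (0.5, 0.3, 0.1, 0.4)
print(d_star(*params), f(d_star(*params), *params), f(2, *params))
```

For these sample values `f` evaluates to 1 (up to floating-point error) at `d_star`, and to a value strictly below 1 at the integer degrees 2 and 5, in line with the argument of the proof.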

Lemma 14 (Main lemma for large k). For all constants $c_1$ and $c_2$ with $c_1 \le c_2$, for all $c_1 \cdot n \le k \le c_2 \cdot n$, and for all $t \le c_2 \cdot n$, the probability that the size of the reverse reachable set $S$ is $k$ is at most $\frac{1}{n^2}$.

Proof. By Lemma 11 we have $a_{k_1,k_2,\ldots,k_x} \le (n+1)^x \cdot \mathrm{Term}_1 \cdot \mathrm{Term}_2$, and by Lemma 12 and Lemma 13 we have that Term2 is constant and Term1 is exponentially small in $n$, where $x = (d_{\max} - d_{\min} + 1)$. The exponentially small Term1 overrides the polynomial factor $(n+1)^x$ and the constant Term2, and ensures that $a_{k_1,k_2,\ldots,k_x} \le n^{-3x}$. By Lemma 3 it follows that $a_k \le n^{-2x} \le \frac{1}{n^2}$. ∎

3.3 Very large k: $(1 - 1/e^2)n$ to $n - d_{\min} - 1$

In this subsection we consider the case when the size $k$ of the reverse reachable set is between $(1 - \frac{1}{e^2}) \cdot n$ and $n - d_{\min} - 1$. Note that if the reverse reachable set has size at least $n - d_{\min}$, then the reverse reachable set must be the set of all vertices, as otherwise the remaining vertices cannot have enough edges among themselves. Take $m = n - k$. Hence $d_{\min} + 1 \le m \le n/e^2$. As stated earlier, in this case $a_{k_1,k_2,\ldots,k_x}$ becomes small since we require that the $m$ vertices outside the reverse reachable set must have all their edges within themselves; this corresponds to the factor of $\binom{n-k}{d_i} / \binom{n}{d_i}$. Since $m$ is very small, this has a very low probability. With this intuition, we proceed to show the following bound on $a_{k_1,k_2,\ldots,k_x}$.

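Lemma 15. We have $a_{k_1,k_2,\ldots,k_x} \le \left(x \cdot e \cdot \frac{m}{n}\right)^m$.

Proof. We have

\[
\begin{aligned}
a_{k_1,k_2,\ldots,k_x}
&= \left[\prod_{i=1}^{x} \binom{a_i - t_i}{k_i - t_i} \left(\frac{\binom{n-k}{d_i}}{\binom{n}{d_i}}\right)^{a_i - k_i}\right] \cdot R(k_1,k_2,\ldots,k_x) \\
&\le \prod_{i=1}^{x} \binom{a_i - t_i}{k_i - t_i} \left(\frac{\binom{n-k}{d_i}}{\binom{n}{d_i}}\right)^{a_i - k_i} && \text{(Ignoring probability value $R(k_1,k_2,\ldots,k_x) \le 1$)} \\
&= \prod_{i=1}^{x} \binom{a_i - t_i}{a_i - k_i} \left(\frac{\binom{n-k}{d_i}}{\binom{n}{d_i}}\right)^{a_i - k_i} && \text{(Since $\tbinom{\ell}{j} = \tbinom{\ell}{\ell - j}$)} \\
&\le \prod_{i=1}^{x} \binom{a_i - t_i}{a_i - k_i} \left(\frac{m}{n}\right)^{d_i (a_i - k_i)} && \text{(By Lemma 2)} \\
&\le \prod_{i=1}^{x} \left(\frac{e \cdot (a_i - t_i)}{a_i - k_i}\right)^{a_i - k_i} \left(\frac{m}{n}\right)^{d_i (a_i - k_i)} && \text{(Inequality 1 of Proposition 1)} \\
&\le \left(\frac{m}{n}\right)^{2m} \prod_{i=1}^{x} \left(\frac{e \cdot (a_i - t_i)}{a_i - k_i}\right)^{a_i - k_i} && \text{(Since $d_i \ge 2$ and $\textstyle\sum_{i=1}^{x} (a_i - k_i) = m$)}
\end{aligned}
\]

Proposition 1 is presented in the technical appendix. Since for all $i$ we have $(a_i - t_i) \le n - t$, it follows that $\prod_{i=1}^{x} (a_i - t_i)^{a_i - k_i} \le (n - t)^{\sum_{i=1}^{x} (a_i - k_i)} = (n - t)^m$. We also want a lower bound for $\prod_{i=1}^{x} (a_i - k_i)^{a_i - k_i}$. Note that $\sum_{i=1}^{x} (a_i - k_i) = m$ is fixed. Hence, this is a problem of minimizing $\prod_{i=1}^{x} y_i^{y_i}$ given that $\sum_{i=1}^{x} y_i = m$ is fixed. As before, this reduces to $\frac{\partial}{\partial y_a} \prod_{i=1}^{x} y_i^{y_i} = \frac{\partial}{\partial y_b} \prod_{i=1}^{x} y_i^{y_i}$ for all $a, b$. Hence, the minimum is attained at $y_i = m/x$ for all $i$. Hence, $\prod_{i=1}^{x} (a_i - k_i)^{a_i - k_i} \ge \left(\frac{m}{x}\right)^m$. Combining these,

\[
a_{k_1,k_2,\ldots,k_x} \le \left(\frac{m}{n}\right)^{2m} \cdot \frac{e^m \cdot (n - t)^m}{(m/x)^m}
= \left(x \cdot e \cdot \frac{m}{n} \cdot \frac{n - t}{n}\right)^m
\le \left(x \cdot e \cdot \frac{m}{n}\right)^m.
\]

Hence we have the desired inequality. ∎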

We see that $\left(x \cdot e \cdot \frac{m}{n}\right)^m$ is a convex function in $m$ and its maximum is attained at one of the endpoints. For $m = n/e^2$, the bound is exponentially decreasing with $n$, whereas for constant $m$, the bound is polynomially decreasing in $n$. Hence, the maximum is attained at the left endpoint of the interval (constant value of $m$). However, the bound we get is not sufficient to apply Lemma 3 directly. An important observation is that as $m$ becomes smaller and smaller, the number of combinations $\sum k_i = k$, $t_i \le k_i \le a_i$ in the expression of $a_k$ also decreases. Thus, we break this case into two sub-cases: $d_{\max} + 1 < m \le n/e^2$ and $d_{\min} + 1 \le m \le d_{\max} + 1$.

Lemma 16. For $d_{\max} + 1 < m \le n/e^2$, we have $a_{k_1,k_2,\ldots,k_x} \le n^{-(2+x)}$ and $a_k \le \frac{1}{n^2}$.

Proof. As we have seen, we only need to prove this for the value of $m$ where $a_{k_1,k_2,\ldots,k_x}$ attains its maximum, i.e., $m = d_{\max} + 2$. Note that $d_{\max} + 1 = x + d_{\min} \ge x + 2$. Hence,

\[
\begin{aligned}
a_{k_1,k_2,\ldots,k_x}
&\le \left(x \cdot e \cdot \frac{m}{n}\right)^m && \text{(By Lemma 15)} \\
&\le \left(x \cdot e \cdot \frac{d_{\max} + 2}{n}\right)^{d_{\max} + 2} \\
&= \left(x \cdot e \cdot (d_{\max} + 2)\right)^{d_{\max} + 2} \cdot n^{-(d_{\max} + 2)} \\
&\le n^{-(2+x)} && \text{(Since first term is a constant)}
\end{aligned}
\]

Hence we obtain the first inequality of the lemma. By Lemma 3 and the first inequality of the lemma we have $a_k \le \frac{1}{n^2}$. ∎

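Lemma 17. There exists a constant $h$ such that for $d_{\min} + 1 \le m \le d_{\max} + 1$, we have $a_{k_1,k_2,\ldots,k_x} \le h \cdot n^{-m}$ and $a_k \le \frac{h}{n^2}$.

Proof. By Lemma 15 we have

\[
a_{k_1,k_2,\ldots,k_x} \le \left(x \cdot e \cdot \frac{m}{n}\right)^m \le \left(x \cdot e \cdot (d_{\max} + 1)\right)^{d_{\max} + 1} \cdot n^{-m}.
\]

Let $h = \left(x \cdot e \cdot (d_{\max} + 1)\right)^{d_{\max} + 1}$. Hence, the first part is proved.

Now, for the second part, we note that since there are $m$ vertices outside the reverse reachable set, and all their edges must be within these $m$ vertices, they must have degree at most $m - 1$. Hence, there are now $n$ vertices with at most $m - d_{\min}$ distinct degrees. Hence, in the summation $a_k = \sum_{\sum k_i = k,\; t_i \le k_i \le a_i} a_{k_1,k_2,\ldots,k_x}$, there are at most $n^{m - d_{\min}}$ terms. Thus we have

\[
a_k \le n^{m - d_{\min}} \cdot h \cdot n^{-m} = h \cdot n^{-d_{\min}} \le \frac{h}{n^2}.
\]

The desired result follows. ∎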



Lemma 18 (Main lemma for very large k). For all $t$, for all $(1 - \frac{1}{e^2}) \cdot n \le k \le n - 1$, the probability that the size of the reverse reachable set $S$ is $k$ is at most $O(\frac{1}{n^2})$.

Proof. By Lemma 16 and Lemma 17 we obtain the result for all $(1 - \frac{1}{e^2}) \cdot n \le k \le n - d_{\min} - 1$. Since the reverse reachable set must contain all vertices if it has size at least $n - d_{\min}$, the result follows. ∎

3.4 Expected Number of Iterations and Running Time 

From Lemma 10, Lemma 14, and Lemma 18, we obtain that there exists a constant $h$ such that

\[
\begin{cases}
a_k \le \frac{1}{n^2}, & 30 \cdot x \cdot \log(n) \le k < n - d_{\max} - 1 \\
a_k \le \frac{h}{n^2}, & n - d_{\max} - 1 \le k \le n - d_{\min} - 1 \\
a_k = 0, & n - d_{\min} \le k \le n - 1
\end{cases}
\]

Hence using the union bound we get the following result: $P(|S| < 30 \cdot x \cdot \log(n) \text{ or } |S| = n) \ge 1 - \frac{h}{n}$, where $S$ is the reverse reachable set of the target set (i.e., with probability at least $1 - \frac{h}{n}$ either at most $30 \cdot x \cdot \log(n)$ vertices reach the target set or all the vertices reach the target set). Let $I(n)$ and $T(n)$ denote the expected number of iterations and the expected running time of the classical algorithm for MDPs on random graphs with $n$ vertices and constant out-degree. Then from above we have

\[
I(n) \le \left(1 - \frac{h}{n}\right) \cdot 30 \cdot x \cdot \log(n) + \frac{h}{n} \cdot n.
\]

It follows that $I(n) = O(\log(n))$. For the expected running time we have

\[
T(n) \le \left(1 - \frac{h}{n}\right) \cdot (30 \cdot x \cdot \log(n))^2 + \frac{h}{n} \cdot n^2.
\]

It follows that $T(n) = O(n)$. Hence we have the following theorem.

Theorem 1. The expected number of iterations and the expected running time of the classical algorithm for MDPs with Büchi objectives over graphs with constant out-degree are at most $O(\log(n))$ and $O(n)$, respectively.
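
The dichotomy behind the theorem (the reverse reachable set is either small or the whole vertex set) can be explored experimentally. The following Python sketch is an illustration under stated assumptions, not the procedure analyzed above: the sampling model in `random_out_degree_graph` (every vertex picks `d` distinct successors uniformly among the other vertices, with no self-loops) and all function names are hypothetical choices made for this example.

```python
import random
from collections import deque

def random_out_degree_graph(n, d, rng):
    """Assumed model: each vertex gets d distinct successors among the other n - 1 vertices."""
    graph = []
    for v in range(n):
        others = list(range(v)) + list(range(v + 1, n))
        graph.append(rng.sample(others, d))
    return graph

def reverse_reachable(succ, targets):
    """Set of vertices with a directed path to some target (backward BFS)."""
    n = len(succ)
    pred = [[] for _ in range(n)]
    for v, outs in enumerate(succ):
        for u in outs:
            pred[u].append(v)
    seen = set(targets)
    queue = deque(targets)
    while queue:
        u = queue.popleft()
        for v in pred[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

if __name__ == "__main__":
    rng = random.Random(0)
    n, d = 2000, 3
    sizes = [len(reverse_reachable(random_out_degree_graph(n, d, rng), [0]))
             for _ in range(5)]
    print(sizes)
```

Under this assumed model, any vertex outside the reverse reachable set has all `d` successors outside it as well, so the computed set either contains all `n` vertices or misses at least `d + 1` of them.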



4 Average Case Analysis in Erdős–Rényi Model

In this section we consider the classical Erdős–Rényi model of random graphs $G_{n,p}$, with $n$ vertices, where each edge is chosen to be in the graph independently with probability $p$ [11] (we consider directed graphs, and then $G_{n,p}$ is also referred to as $D_{n,p}$ in the literature). First, in Section 4.1 we consider the case when $p$ is $\Omega\left(\frac{\log(n)}{n}\right)$, and then we consider the case when $p = \frac{1}{2}$ (that generates the uniform distribution over all graphs). We will show two results: (1) if $p \ge \frac{c \cdot \log(n)}{n}$ for any constant $c > 2$, then the expected number of iterations is constant and the expected running time is linear; and (2) if $p = \frac{1}{2}$ (with $p = \frac{1}{2}$ we consider all graphs to be equally likely), then the probability that the number of iterations is more than one falls exponentially in $n$ (in other words, graphs where the running time is more than linear are exponentially rare).



4.1 $G_{n,p}$ with $p = \Omega\left(\frac{\log(n)}{n}\right)$

In this subsection we will show that given $p \ge \frac{c \cdot \log(n)}{n}$ for any constant $c > 2$, the probability that not all vertices can reach the given target set is less than $O(1/n)$. Hence the expected number of iterations of the classical algorithm for MDPs with Büchi objectives is constant, and hence the algorithm works in average time linear in the size of the graph. Observe that to show the result the worst possible case is when the size of the target set is 1, as otherwise the chance that all vertices reach the target set is higher.

The probability $R(n,p)$. For a random graph in $G_{n,p}$ and a given target vertex, we denote by $R(n,p)$ the probability that each vertex in the graph has a path along the directed edges to the target vertex. Our goal is to obtain a lower bound on $R(n,p)$.

The key recurrence. Consider a random graph $G$ with $n$ vertices, with a given target vertex, and edge probability $p$. For a set $K$ of vertices with size $k$ (i.e., $|K| = k$), which contains the target vertex, $R(k,p)$ is the probability that each vertex in the set $K$ has a path to the target vertex that lies within the set $K$ (i.e., the path only visits vertices in $K$). The probability $R(k,p)$ depends only on $k$ and $p$, due to the symmetry among vertices.

Consider the subset $S$ of all vertices in $V$ which have a path to the target vertex. In that case, for all vertices $v$ in $V \setminus S$, there is no edge going from $v$ to a vertex in $S$ (otherwise there would have been a path from $v$ to the target vertex). Thus there are no incoming edges from $V \setminus S$ to $S$. Let $|S| = i$. Then the $i \cdot (n - i)$ edges from $V \setminus S$ to $S$ should be absent, and each edge is absent with probability $(1 - p)$. The probability that each vertex in $S$ can reach the target is $R(i,p)$. So the probability of $S$ being the reverse reachable set is given by:

\[
(1 - p)^{i \cdot (n - i)} \cdot R(i,p). \qquad (2)
\]

There are $\binom{n-1}{i-1}$ possible subsets of $i$ vertices that include the given target vertex, and $i$ can range from 1 to $n$. Exactly one subset $S$ of $V$ will be the reverse reachable set. So the sum of probabilities of the events that $S$ is the reverse reachable set is 1. Hence we have:

\[
1 = \sum_{i=1}^{n} \binom{n-1}{i-1} \cdot (1 - p)^{i \cdot (n - i)} \cdot R(i,p). \qquad (3)
\]

Moving all but the last term (with $i = n$) to the other side, we get the following recurrence relation:

\[
R(n,p) = 1 - \sum_{i=1}^{n-1} \binom{n-1}{i-1} \cdot (1 - p)^{i \cdot (n - i)} \cdot R(i,p). \qquad (4)
\]
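
The recurrence can be evaluated exactly for small $n$. The following Python sketch is an illustration, not part of the original text; the function name `reach_all_probability` is a hypothetical choice, and exact rational arithmetic is used only to avoid rounding questions.

```python
from fractions import Fraction
from math import comb

def reach_all_probability(n, p):
    """R(n, p) computed bottom-up from the recurrence (Equation 4), with R(1, p) = 1."""
    p = Fraction(p)
    q = 1 - p
    R = [None, Fraction(1)]  # R[m] holds R(m, p); index 0 is unused
    for m in range(2, n + 1):
        missing = sum(comb(m - 1, i - 1) * q ** (i * (m - i)) * R[i]
                      for i in range(1, m))
        R.append(1 - missing)
    return R[n]

if __name__ == "__main__":
    for n in (2, 3, 5, 10):
        print(n, float(reach_all_probability(n, Fraction(1, 2))))
```

For example, the recurrence gives $R(2,p) = p$ (the single non-target vertex needs its edge to the target) and $R(3,\frac{1}{2}) = \frac{1}{2}$, which agrees with a direct enumeration of the four relevant edges.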



Bound on $p$ for lower bound on $R(n,p)$. We will prove a lower bound on $p$ in terms of $n$ such that the probability that not all $n$ vertices can reach the target vertex is less than $O(1/n)$. In other words, we require

\[
R(n,p) \ge 1 - O\left(\frac{1}{n}\right). \qquad (5)
\]

Since $R(i,p)$ is a probability value, it is at most 1. Hence from Equation 4 it follows that it suffices to show that

\[
\sum_{i=1}^{n-1} \binom{n-1}{i-1} \cdot (1 - p)^{i \cdot (n - i)} \cdot R(i,p) \le \sum_{i=1}^{n-1} \binom{n-1}{i-1} \cdot (1 - p)^{i \cdot (n - i)} \le O\left(\frac{1}{n}\right) \qquad (6)
\]

to show that $R(n,p) \ge 1 - O\left(\frac{1}{n}\right)$. We will prove a lower bound on $p$ for achieving Equation 6. Let us denote by $t_i = \binom{n-1}{i-1} \cdot (1 - p)^{i \cdot (n - i)}$. The following lemma establishes a relation of $t_i$ and $t_{n-i}$.



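Lemma 19. We have $t_{n-i} = \frac{n-i}{i} \cdot t_i$.

Proof. We have

\[
t_{n-i} = \binom{n-1}{n-i-1} (1 - p)^{i \cdot (n - i)}
= \binom{n-1}{i} (1 - p)^{i \cdot (n - i)}
= \frac{n-i}{i} \binom{n-1}{i-1} (1 - p)^{i \cdot (n - i)}
= \frac{n-i}{i} \cdot t_i.
\]

The desired result follows. ∎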
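
The identity can be confirmed with exact arithmetic. The sketch below is illustrative only; the helper names `t` and `lemma19_holds` are assumptions introduced for this check.

```python
from fractions import Fraction
from math import comb

def t(n, i, p):
    """t_i = C(n-1, i-1) * (1 - p)^(i * (n - i)), as an exact rational."""
    return comb(n - 1, i - 1) * (1 - Fraction(p)) ** (i * (n - i))

def lemma19_holds(n, p):
    """Check t_{n-i} == ((n - i) / i) * t_i for every 1 <= i <= n - 1."""
    return all(t(n, n - i, p) == Fraction(n - i, i) * t(n, i, p)
               for i in range(1, n))

if __name__ == "__main__":
    print(lemma19_holds(12, Fraction(1, 3)))
```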

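Let $g_i = t_i + t_{n-i}$. From the previous lemma we have

\[
g_i = \frac{n}{i} \cdot t_i = \frac{n}{i} \cdot \binom{n-1}{i-1} \cdot (1 - p)^{i \cdot (n - i)} = \binom{n}{i} \cdot (1 - p)^{i \cdot (n - i)}.
\]

We now establish a bound on $g_i$ in terms of $t_1$. In the subsequent lemma we establish a bound on $t_1$.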

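Lemma 20. If $p \ge \frac{c \cdot \log(n)}{n}$ with $c > 2$, then for all $2 \le i \le \frac{n}{2}$ we have $g_i \le t_1$.

Proof. Let $p \ge \frac{c \cdot \log(n)}{n}$ with $c > 2$. Now

\[
\begin{aligned}
\frac{t_1}{g_i} &= \frac{(1 - p)^{n-1}}{\binom{n}{i} \cdot (1 - p)^{i \cdot (n - i)}} \\
&\ge \frac{1}{n^i \cdot (1 - p)^{(i-1) \cdot (n-i-1)}} && \text{(Rearranging powers of $(1 - p)$ and $\tbinom{n}{i} \le n^i$)} \\
&\ge \frac{1}{n^i \cdot e^{-\frac{c \cdot \log(n)}{n} \cdot (i-1) \cdot (n-i-1)}} && (1 - x \le e^{-x}) \\
&= \frac{1}{n^i} \cdot e^{\frac{c \cdot \log(n)}{n} \cdot (i-1) \cdot (n-i-1)}.
\end{aligned}
\]

To show that $t_1 \ge g_i$, it is sufficient to show that

\[
\frac{c}{n} \cdot (i-1) \cdot (n-i-1) - i \ge 0 \iff \frac{i \cdot n}{(i-1) \cdot (n-i-1)} \le c.
\]

Note that $f(i) = \frac{i \cdot n}{(i-1) \cdot (n-i-1)}$ is convex for $2 \le i \le n/2$. Hence, its maximum value is attained at either of the endpoints. We can see that

\[
f(2) = \frac{2 \cdot n}{n - 3} \le c \quad \text{(for sufficiently large $n$ and $c > 2$)}
\]

and

\[
f(n/2) = 2 \cdot \left(\frac{n}{n-2}\right)^2 \le c \quad \text{(for sufficiently large $n$ and $c > 2$)}.
\]

The result follows. ∎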
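
The claim of the lemma can be checked numerically for concrete parameters. The Python sketch below is an illustration rather than part of the proof: the function names are hypothetical, natural logarithms are assumed for $\log(n)$, and logarithms of $t_1$ and $g_i$ are compared to avoid floating-point underflow.

```python
import math

def log_t1(n, p):
    """log of t_1 = (1 - p)^(n - 1)."""
    return (n - 1) * math.log1p(-p)

def log_g(n, i, p):
    """log of g_i = C(n, i) * (1 - p)^(i * (n - i)), using lgamma for the binomial."""
    log_binom = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
    return log_binom + i * (n - i) * math.log1p(-p)

def lemma20_holds(n, c):
    """Check g_i <= t_1 for all 2 <= i <= n/2 at p = c * ln(n) / n."""
    p = c * math.log(n) / n
    return all(log_g(n, i, p) <= log_t1(n, p) for i in range(2, n // 2 + 1))

if __name__ == "__main__":
    print(lemma20_holds(1000, 3.0), lemma20_holds(1000, 0.5))
```

For $n = 1000$ the check succeeds with $c = 3$ and fails with $c = 0.5$ (already at $i = 2$), which illustrates that some lower bound on $p$ of this form is needed.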
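Lemma 21. If $p \ge \frac{c \cdot \log(n)}{n}$ with $c > 2$, then $t_1 \le \frac{1}{n^2}$ for $n$ sufficiently large.

Proof. We have $t_1 = (1 - p)^{n-1}$. For $p \ge \frac{c \cdot \log(n)}{n}$ we have

\[
t_1 \le \left(1 - \frac{c \cdot \log(n)}{n}\right)^{n-1} \le e^{-\frac{c \cdot \log(n)}{n} \cdot (n-1)} \le e^{-2 \cdot \log(n)} = \frac{1}{n^2} \quad \text{(for sufficiently large $n$, since $c > 2$)}.
\]

Hence, the desired result follows. ∎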
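
A short numerical check shows why the lemma needs $n$ to be sufficiently large. The sketch is illustrative only; `t1_bound_holds` is a hypothetical helper, and natural logarithms are assumed.

```python
import math

def t1_bound_holds(n, c):
    """Check t_1 = (1 - p)^(n - 1) <= 1 / n^2 at p = c * ln(n) / n (requires p <= 1)."""
    p = c * math.log(n) / n
    return (1 - p) ** (n - 1) <= 1 / n ** 2

if __name__ == "__main__":
    print(t1_bound_holds(2, 2.1), t1_bound_holds(100, 3.0))
```

With $c = 2.1$ the bound fails at $n = 2$ (here $t_1 \approx 0.272 > 0.25$), while with $c = 3$ it holds comfortably at $n = 100$.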

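We are now ready to establish the main lemma that proves the lower bound on $R(n,p)$ and then the main result of the section.

Lemma 22. For sufficiently large $n$, for all $p \ge \frac{c \cdot \log(n)}{n}$ with $c > 2$, we have $R(n,p) \ge 1 - \frac{3}{2 \cdot n}$.

Proof. We first show that $\sum_{i=1}^{n-1} t_i \le \frac{3}{2 \cdot n}$. We have

\[
\begin{aligned}
\sum_{i=1}^{n-1} t_i &= t_1 + t_{n-1} + \sum_{i=2}^{n-2} t_i \\
&\le t_1 + t_{n-1} + \sum_{i=2}^{\lfloor n/2 \rfloor} g_i \\
&= n \cdot t_1 + \sum_{i=2}^{\lfloor n/2 \rfloor} g_i && \text{(We apply $t_i + t_{n-i} = \tfrac{n}{i} \cdot t_i$ with $i = 1$)} \\
&\le n \cdot t_1 + \sum_{i=2}^{\lfloor n/2 \rfloor} t_1 && \text{(By Lemma 20 we have $g_i \le t_1$)} \\
&\le \frac{3 \cdot n}{2} \cdot t_1 \\
&\le \frac{3 \cdot n}{2} \cdot \frac{1}{n^2} = \frac{3}{2 \cdot n} && \text{(By Lemma 21 we have $t_1 \le \tfrac{1}{n^2}$)}
\end{aligned}
\]

By Equation 6 we have that $R(n,p) \ge 1 - \sum_{i=1}^{n-1} t_i$. It follows that $R(n,p) \ge 1 - \frac{3}{2 \cdot n}$. ∎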

Theorem 2. The expected number of iterations of the classical algorithm for MDPs with Büchi objectives for random graphs $G_{n,p}$, with $p \ge \frac{c \cdot \log(n)}{n}$ where $c > 2$, is $O(1)$, and the average case running time is linear.

Proof. By Lemma 22 it follows that $R(n,p) \ge 1 - \frac{3}{2 \cdot n}$, and if all vertices reach the target set, then the classical algorithm ends in one iteration. In the worst case the number of iterations of the classical algorithm is $n$. Hence the expected number of iterations is bounded by

\[
1 \cdot \left(1 - \frac{3}{2 \cdot n}\right) + n \cdot \frac{3}{2 \cdot n} = O(1).
\]

Since the expected number of iterations is $O(1)$ and every iteration takes linear time, it follows that the average case running time is linear. ∎
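
The probability $R(n,p)$ that drives this bound can also be estimated by sampling. The following Python sketch is a Monte Carlo illustration under stated assumptions, not the analysis above: the function names, the fixed seed, the choice of vertex 0 as the target, and the omission of self-loops (which do not affect reachability) are all choices made for this example.

```python
import math
import random

def all_reach_target(n, p, rng):
    """Sample a directed G(n, p) and test whether every vertex reaches vertex 0."""
    # pred[u] lists the vertices v that have an edge v -> u
    pred = [[v for v in range(n) if v != u and rng.random() < p] for u in range(n)]
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in pred[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n

def estimate_R(n, c, trials, seed=0):
    """Fraction of sampled graphs with p = c * ln(n) / n in which all vertices reach the target."""
    p = min(1.0, c * math.log(n) / n)
    rng = random.Random(seed)
    return sum(all_reach_target(n, p, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    print(estimate_R(200, 3.0, 100))
```

The estimate is only a sanity check for moderate $n$; it does not replace the bound of Lemma 22.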

4.2 Average-case analysis over all graphs 

In this section, we consider the uniform distribution over all graphs, i.e., all possible different graphs are equally likely. This is equivalent to considering the Erdős–Rényi model such that each edge has probability $\frac{1}{2}$. Hence we consider $G_{n,\frac{1}{2}}$ and show that the probability that not all vertices reach the target is exponentially small in $n$. It will follow that MDPs where the classical algorithm takes more than a constant number of iterations are exponentially rare. We consider the same recurrence $R(n,p)$ as in the previous subsection and consider $t_k$ and $g_k$ as defined before. The following theorem shows the desired result.

Theorem 3. In $G_{n,\frac{1}{2}}$ with sufficiently large $n$ the probability that the classical algorithm takes more than one iteration is less than $\left(\frac{3}{4}\right)^n$.

Proof. We first observe that Equation 4 and Equation 6 hold for all probabilities. Next we observe that Lemma 20 holds for $p \ge \frac{c \cdot \log(n)}{n}$, and hence also for $p = \frac{1}{2}$ for sufficiently large $n$. Hence by applying the inequalities of the proof of Lemma 22 we obtain that

\[
\sum_{i=1}^{n-1} t_i \le \frac{3 \cdot n}{2} \cdot t_1.
\]

For $p = \frac{1}{2}$ we have $t_1 = \binom{n-1}{0} \cdot \left(1 - \frac{1}{2}\right)^{n-1} = \frac{1}{2^{n-1}}$. Hence we have

\[
R(n,p) \ge 1 - \frac{3 \cdot n}{2 \cdot 2^{n-1}} \ge 1 - \frac{1.5^n}{2^n} = 1 - \left(\frac{3}{4}\right)^n.
\]

The second inequality holds for sufficiently large $n$. It follows that the probability that the classical algorithm takes more than one iteration is less than $\left(\frac{3}{4}\right)^n$. The desired result follows. ∎
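
The phrase "sufficiently large $n$" in the second inequality can be made concrete: $\frac{3n}{2^n} \le \left(\frac{3}{4}\right)^n$ is equivalent to $3n \cdot 2^n \le 3^n$. The Python sketch below, with hypothetical helper names, checks this over the integers; it concerns only this one inequality, not the range of $n$ needed for Lemma 20.

```python
def bound_holds(n):
    """3n / 2^n <= (3/4)^n, rewritten over the integers as 3n * 2^n <= 3^n."""
    return 3 * n * 2 ** n <= 3 ** n

def first_n_where_bound_holds(limit=100):
    """Smallest n in 1..limit satisfying the inequality."""
    return next(n for n in range(1, limit + 1) if bound_holds(n))

if __name__ == "__main__":
    print(first_n_where_bound_holds())  # prints 8
```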



5 Conclusion 



In this work we present the average case analysis of the classical algorithm for qualitative analysis of MDPs with Büchi objectives. Both for the general case and the important special case of MDPs with constant out-degree we establish that the average case running time is linear, as compared to the quadratic worst case complexity. Moreover, since it has been established that the improved algorithms require at most linear time more than the classical algorithm, it also follows that the average case running time of all the improved algorithms is linear. As our results involve several interesting analyses of random directed graphs and reachability properties, we believe our results will have other applications in average case analysis of related problems in verification.

6 Technical Appendix



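Proposition 1 (Useful inequalities from Stirling inequalities). For natural numbers $\ell$ and $j$ with $j \le \ell$ we have the following inequalities:

1. $\binom{\ell}{j} \le \left(\frac{e \cdot \ell}{j}\right)^j$.

2. $\binom{\ell}{j} \le (\ell + 1) \cdot \left(\frac{\ell}{j}\right)^j \cdot \left(\frac{\ell}{\ell - j}\right)^{\ell - j}$.

Proof. The proof of the results is based on the following inequality (Stirling inequality) for factorial:

\[
e \cdot \left(\frac{j}{e}\right)^j \le j! \le e \cdot \left(\frac{j+1}{e}\right)^{j+1}.
\]

We now use the inequality to show the desired inequalities:

1. We have

\[
\binom{\ell}{j} \le \frac{\ell^j}{j!} \le \frac{\ell^j}{e \cdot \left(\frac{j}{e}\right)^j} \quad \text{(using Stirling inequality)} \quad = \frac{1}{e} \cdot \left(\frac{e \cdot \ell}{j}\right)^j \le \left(\frac{e \cdot \ell}{j}\right)^j.
\]

2. We have

\[
\begin{aligned}
\binom{\ell}{j} = \frac{\ell!}{j! \cdot (\ell - j)!}
&\le e \cdot \left(\frac{\ell + 1}{e}\right)^{\ell + 1} \cdot \frac{1}{e \cdot \left(\frac{j}{e}\right)^j \cdot e \cdot \left(\frac{\ell - j}{e}\right)^{\ell - j}} \\
&= \frac{1}{e^2} \cdot (\ell + 1) \cdot \left(\frac{\ell + 1}{\ell}\right)^{\ell} \cdot \left(\frac{\ell}{j}\right)^j \cdot \left(\frac{\ell}{\ell - j}\right)^{\ell - j} \\
&\le \frac{1}{e} \cdot (\ell + 1) \cdot \left(\frac{\ell}{j}\right)^j \cdot \left(\frac{\ell}{\ell - j}\right)^{\ell - j} && \text{(since $\left(1 + \tfrac{1}{\ell}\right)^{\ell} \le e$)} \\
&\le (\ell + 1) \cdot \left(\frac{\ell}{j}\right)^j \cdot \left(\frac{\ell}{\ell - j}\right)^{\ell - j}.
\end{aligned}
\]

The first inequality is obtained by applying the upper bound of the Stirling inequality to the numerator (the first term), and applying the lower bound of the Stirling inequality twice to the denominator (the second term).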
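
Both inequalities are easy to spot-check for small arguments. The Python sketch below is illustrative only; the function names are hypothetical, and the second check is restricted to $1 \le j < \ell$ so that no division by zero occurs.

```python
import math

def inequality_1(l, j):
    """C(l, j) <= (e * l / j)^j for 1 <= j <= l."""
    return math.comb(l, j) <= (math.e * l / j) ** j

def inequality_2(l, j):
    """C(l, j) <= (l + 1) * (l / j)^j * (l / (l - j))^(l - j) for 1 <= j < l."""
    bound = (l + 1) * (l / j) ** j * (l / (l - j)) ** (l - j)
    return math.comb(l, j) <= bound

if __name__ == "__main__":
    print(all(inequality_1(l, j) for l in range(1, 41) for j in range(1, l + 1)))
    print(all(inequality_2(l, j) for l in range(2, 41) for j in range(1, l)))
```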